Banzhaf Random Forests

Authors

  • Jianyuan Sun
  • Guoqiang Zhong
  • Junyu Dong
  • Yajuan Cai
Abstract

Random forests are an ensemble method that makes predictions by combining the results of several independent trees. However, the theory of random forests has long been outpaced by their application. In this paper, we propose a novel random forests algorithm based on cooperative game theory. The Banzhaf power index is employed to evaluate the power of each feature by traversing possible feature coalitions. Unlike the previously used information gain rate of information theory, which simply chooses the most informative feature, the Banzhaf power index can be regarded as a measure of each feature's importance to the dependency among a group of features. More importantly, we prove the consistency of the proposed algorithm, named Banzhaf random forests (BRF). This theoretical analysis takes a step towards narrowing the gap between the theory and practice of random forests for classification problems. Experiments on several UCI benchmark data sets show that BRF is competitive with state-of-the-art classifiers and dramatically outperforms previous consistent random forests. In particular, it is much more efficient than they are.
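The idea of scoring a feature by traversing feature coalitions can be illustrated with the textbook Banzhaf value: the average marginal contribution of a player over all coalitions of the other players. The sketch below is only an illustration under stated assumptions, not the BRF procedure from the paper: the function names `banzhaf_index` and `toy_value`, the feature names, and the weighted-threshold characteristic function are all hypothetical, and the abstract does not say how BRF defines the worth of a coalition.

```python
from itertools import combinations


def banzhaf_index(players, value):
    """Textbook Banzhaf value of each player.

    `value` maps a frozenset of players (a coalition) to a number.
    The score of player p is the mean of value(S | {p}) - value(S)
    over all 2^(n-1) coalitions S that do not contain p.
    """
    players = list(players)
    n = len(players)
    scores = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        # Enumerate every subset of the remaining players.
        for k in range(len(others) + 1):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += value(s | {p}) - value(s)
        scores[p] = total / 2 ** (n - 1)
    return scores


# Hypothetical stand-in for a coalition's worth: a coalition of
# features "wins" (worth 1) when its summed weights reach a quota.
# These weights are made up for illustration only.
WEIGHTS = {"f1": 2, "f2": 1, "f3": 1}
QUOTA = 3


def toy_value(coalition):
    return 1.0 if sum(WEIGHTS[f] for f in coalition) >= QUOTA else 0.0


scores = banzhaf_index(WEIGHTS, toy_value)
print(scores)
```

Exhaustive enumeration costs on the order of 2^(n-1) evaluations per feature, so a sketch like this is only practical for small feature sets.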


Similar resources

Random forests algorithm in podiform chromite prospectivity mapping in Dolatabad area, SE Iran

The Dolatabad area located in SE Iran is a well-endowed terrain owning several chromite mineralized zones. These chromite ore bodies are all hosted in a colored mélange complex zone comprising harzburgite, dunite, and pyroxenite. These deposits are irregular in shape, and are distributed as small lenses along colored mélange zones. The area has a great potential for discovering further chromite...


Mondrian Forests: Efficient Online Random Forests

Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as ...


Random Forests and Adaptive Nearest Neighbors

In this paper we study random forests through their connection with a new framework of adaptive nearest neighbor methods. We first introduce a concept of potential nearest neighbors (k-PNN’s) and show that random forests can be seen as adaptively weighted k-PNN methods. Various aspects of random forests are then studied from this perspective. We investigate the effect of terminal node sizes and...


Bernoulli Random Forests: Closing the Gap between Theoretical Consistency and Empirical Soundness

Random forests are one type of the most effective ensemble learning methods. In spite of their sound empirical performance, the study on their theoretical properties has been left far behind. Recently, several random forests variants with nice theoretical basis have been proposed, but they all suffer from poor empirical performance. In this paper, we propose a Bernoulli random forests model (BR...


Search for the smallest random forest.

Random forests have emerged as one of the most commonly used nonparametric statistical methods in many scientific areas, particularly in analysis of high throughput genomic data. A general practice in using random forests is to generate a sufficiently large number of trees, although it is subjective as to how large is sufficient. Furthermore, random forests are viewed as "black-box" because of ...



Journal:
  • CoRR

Volume: abs/1507.06105

Publication date: 2015